The Elementary Cognitive Agent Classes
Ray Brown, Ph.D.
The most elementary cognitive agents use simple algorithms to perform functions requested by complex agents that have been assigned a task within a group or agency. These fifteen operations are listed in the column on the right. They denote fifteen classes of Applied Chaos elementary agents, defined as follows:

"Classifying" is defined as separating entities into pre-established classes defined by attributes, purposes, functions, or other properties.
"Grouping" is defined as bringing a set of entities together to be treated as similar entities or as a single unit.
"Ordering" is defined as placing entities into a linear arrangement which may have various meanings, such as larger, better, or more suitable for a purpose.
"Linking" is defined as associating two or more entities in a relationship.
"Predicting" is defined as inferring future events or inferring past causes of present events.
"Clarifying" is defined as removing any ambiguities or masks that may prohibit understanding.
"Searching" is defined as looking for entities having stated attributes or properties.
"Comparing" consists in making a check of attributes between two entities.
"Selecting" is defined as making a choice between alternatives.
"Valuing" is the act of assigning a numerical value to an entity.
"Examining" is defined as determining whether a given entity fits a criterion.
"Abstracting" is defined as the determination of the generic or general properties of an object.
"Analyzing" is defined as decomposing an entity into simpler parts.
"Synthesizing" is defined as reassembling components into a larger, more meaningful entity.
"Comparing" also covers listing the similarities and differences of two or more entities.
"Forming analogies" is inferring additional, but previously unknown, similarities between two or more entities based on a known subset of similarities.

Classifying

The simplest form of classifying is to assign the elements of a set to one of two values, typically 1 or 0.
While neural networks were designed for this task, they may be needlessly large and complex for many tasks. Hence, understanding classification at the most basic level is necessary to construct the simplest and smallest-footprint algorithm to be used in an agency. The following example will serve to illustrate this point.
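As a minimal illustrative sketch (not the author's original example, which is not reproduced in this text), a two-class classifier can be as small as a single threshold comparison. The function names `fit_threshold` and `classify`, the midpoint rule, and the sample values are all assumptions made for illustration.

```python
from statistics import mean

def fit_threshold(class0_values, class1_values):
    """Place the decision threshold midway between the two class means.

    Assumes, for illustration, that class-1 values tend to be larger
    than class-0 values.
    """
    return (mean(class0_values) + mean(class1_values)) / 2.0

def classify(value, threshold):
    """Return 1 if the value lies at or above the threshold, otherwise 0."""
    return 1 if value >= threshold else 0

# Hypothetical training data for the two classes.
threshold = fit_threshold([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
labels = [classify(v, threshold) for v in [0.5, 4.9, 5.1, 10.0]]
print(threshold, labels)
```

The whole agent consists of one stored number and one comparison, which is the sense in which such a classifier has a small footprint.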
This example shows that by working from first principles, very simple, small-footprint algorithms (agents) can be constructed as classifiers. Also, from this example, classifiers having far more complex dynamics can be constructed which are still very simple with small footprints.

Grouping

Grouping algorithms can be constructed using a metric. The basic idea is to identify clusters of values so that they may be treated similarly, or to identify trends. A simple approach is to start with an arbitrary partition of the data by square neighborhoods of equal size. The measure of any element of the partition is the number of points in that element divided by the total number of points in the sample. From this we compute the mathematical entropy of the partition and normalize it to the maximum entropy of the partition for Lebesgue measure, or more generally Haar measure. When this value is less than 1, we have an immediate indication of the existence of clusters. This allows us to focus attention on the elements of the partition in which a potential cluster lies. Alternatively, one can proceed by sequential analysis, comparing the number of points in an element of the partition to the expected number for a uniform distribution. Once a potential cluster is found, the partition may be refined, or one may compute the mean of the points within that element and then the variance from the mean (a distance function). If the variance lies below a given threshold, we form a new partition with the mean as the center of one element of the partition, and the variance is used to determine the length of the partition elements. Otherwise, refine the partition element and repeat the process. A large number of grouping algorithms have been formulated; however, in order to obtain the smallest-footprint algorithm, one must consider the specific features of the problem domain and derive the algorithm from first principles.
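The entropy test described above can be sketched as follows for points in the plane. This is a minimal illustration under stated assumptions: the function name `partition_entropy`, the rectangular `bounds` argument, and the rule that a cell is a cluster candidate when it holds more points than a uniform distribution would give it are choices made here, not details taken from the text. For equal-size cells, the maximum entropy under Lebesgue measure is the logarithm of the number of cells.

```python
import math
from collections import Counter

def partition_entropy(points, cell_size, bounds):
    """Return (normalized entropy, candidate cluster cells) for 2-D points.

    bounds is (xmin, ymin, xmax, ymax); all points are assumed to lie inside.
    """
    xmin, ymin, xmax, ymax = bounds
    nx = max(1, math.ceil((xmax - xmin) / cell_size))
    ny = max(1, math.ceil((ymax - ymin) / cell_size))

    # Count the points falling in each square cell of the partition.
    counts = Counter()
    for x, y in points:
        i = min(int((x - xmin) / cell_size), nx - 1)
        j = min(int((y - ymin) / cell_size), ny - 1)
        counts[(i, j)] += 1

    total = len(points)
    # Measure of a cell = points in the cell / total points.
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    max_entropy = math.log(nx * ny)  # entropy of the uniform case
    ratio = entropy / max_entropy if max_entropy > 0 else 1.0

    # Cells holding more points than the uniform expectation.
    expected = total / (nx * ny)
    dense = [cell for cell, c in counts.items() if c > expected]
    return ratio, dense

# Hypothetical sample: four points crowded into one corner of a 2 x 2 grid.
ratio, dense = partition_entropy(
    [(0.1, 0.1), (0.2, 0.2), (0.3, 0.1), (0.2, 0.4)], 1.0, (0.0, 0.0, 2.0, 2.0)
)
print(ratio < 1.0, dense)
```

A ratio well below 1 directs attention to the cells in `dense`, which can then be refined or summarized by their mean and variance as the text describes.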
Ordering

The point of ordering data is to prioritize alternatives according to the operational requirements of an objective or project. Most of the data in the external world is not ordered, and so no simple assignment of numerical values is available. Ideally, one would like to find an increasing function that maps the data or alternatives to an interval so that the largest or smallest value corresponds to the optimal decision. Many entities or alternatives will typically receive the same value when an "ordering" or "valuation" function is constructed. This is usually due to the absence of precise information, but it may be due to the existence of two equally valid alternatives. In general, ordering algorithms are problem specific and should be derived from a requirements analysis.

Linking

Linking refers to establishing relationships between entities or phenomena. The most problematic are human relationships, such as linking money laundering activities to terrorist activities. These algorithms require natural language processing to discover relationships that are operationally meaningful. Within a narrow operational context, it is possible to use simple noun-verb-noun relationships to produce small-footprint linking algorithms without incurring the burden of a full-scale semantic analysis tool. This is because in fast-moving military operations, information is not exchanged using complex language constructions such as adverbial clauses or ambiguous adjectives or adverbs. Also, the "distance" between a pronoun and its antecedent can be made short by standardizing rapid attack and defense communications, thus reducing the problems of semantic analysis. Lastly, a context ontology can be used as a semantic filter to simplify the data being sent to the cognitive agent tasked with performing the link analysis.

Predicting

Prediction algorithms are among the most widely used and widely misunderstood of all algorithm classes.
Prediction is inextricably tied to sampling. Prediction can only make sense in the context of repeated trials. The confusion arises from the fact that society has always used the term "probability" in a casual sense (the term dates from the 15th century, whereas probability theory dates from the 17th century). In particular, the first definition in the dictionary is not mathematical and is independent of the subject of probability theory. When the term "probability" is used in informal or semi-formal discourse, its use does not mean that the formal sense applies as well (this is a case of semantics and context). Hence, the dictionary definition applies when one is not in the context of repeated samples or trials. A key point is that the "meaning" of the formal definition is based on the asymptotic limit of the proportion of times a fixed event occurs within a series of trials. The relevant asymptotic limit is computed over an infinite series of trials and is a slowly converging ratio. Without repeated trials, the formal meaning of probability is easily confused with the meaning used in informal discourse. The consequence of this is that probability algorithms are often used when other algorithms would provide a more accurate decision criterion. When probability algorithms are relevant, a requirements analysis is used to determine whether parametric or non-parametric methods are appropriate. Also, if a parametric approach (the conventional approach) is warranted, the determination of the parametric model (Gaussian, Poisson, gamma, etc.) must be made and tested at the outset. Once these decisions are made, very simple approximations to these models can be constructed and inserted into small-footprint agents for simplified, high-speed calculations. Additionally, by using experimental data from the operational environment, tailor-made models can be constructed to fit the data over the short term and can be discarded when the statistics of the environment change.
In this way, the errors due to using lightweight, small-footprint algorithms are offset by adapting the model over time. Two methods of prediction are relevant: looking into the future, and hypothesizing the origins of causes in the past. However, both methods require multiple samples to be relevant. The most important engineering analysis centers on formulating the prediction problem in terms of a set of samples in order that the methods of statistical inference may be used with confidence.

The issue of uncertainty

The single most disconcerting aspect of prediction is what to do with uncertainty. Uncertainty appears in varying degrees in every use of an algorithm, whether it is about predicting the past, the future, or neither. Whenever we use a square root algorithm, there is an error or degree of uncertainty in the last decimal place. We normally assume that the last decimal place error can be modeled by the uniform distribution and that the level of uncertainty is inversely proportional to the number of decimal places. The source of confidence in the use of such numbers as the square root of two, pi, or the natural logarithm base is that we can get these numbers as right as we please, because there are algorithms that can be iterated as many times as we choose to get as many decimals as we choose. In essence, we feel that our model, which is an algorithm, is very accurate. But every model that is computable is an algorithm. An algorithm that computes the probability of an event yields a number that we hope is accurate at least in the first decimal place.

Clarifying

Data often arrives with errors or ambiguities. If the data, or the errors in the data, fit a model, then frequently the model can be used to remove these errors. In signal processing this is called filtering. In text processing it can be called disambiguation using context as a filter, or it can just be clarifying by adding missing data.
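Filtering with a model can be sketched with a scalar Kalman-style filter that cleans a noisy measurement stream one point at a time. This is a minimal illustration, not a prescribed method: the model (a slowly drifting level observed with additive noise), the variance values, and the name `scalar_filter` are assumptions chosen for the example.

```python
def scalar_filter(measurements, process_var=1e-4, meas_var=0.25, x0=0.0, p0=1.0):
    """Filter a noisy scalar series, returning one estimate per measurement.

    Assumed model: level[k] = level[k-1] + small drift,
                   measurement[k] = level[k] + noise.
    """
    x, p = x0, p0                  # current estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var        # predict: uncertainty grows with drift
        gain = p / (p + meas_var)  # weight given to the new measurement
        x = x + gain * (z - x)     # update: move toward the measurement
        p = (1.0 - gain) * p       # uncertainty shrinks after the update
        estimates.append(x)
    return estimates

# Hypothetical data: a level of 2.0 corrupted by alternating +/-0.5 errors.
noisy = [2.0 + (0.5 if k % 2 == 0 else -0.5) for k in range(50)]
cleaned = scalar_filter(noisy)
print(round(cleaned[-1], 2))
```

The filter keeps only two numbers of state between measurements, so it suits agents with tight memory and CPU limits.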
When data is the time series of a parameter, curve fitting is a simple method of clarifying. Polynomial curve fitting (including trigonometric polynomials and other special functions) can take on numerous forms depending on the operational objectives. Curve fitting can be sequential or batch. Sequential curve-fitting methods, such as the Kalman filter, are iterative, whereas polynomial fitting is "batch". In sequential filters, the data is "cleaned" of errors and noise one point at a time, typically as the data is received. Sequential processing is appropriate for real-time processing and for processing in which CPU and memory are constrained. Sequential curve fitting is also appropriate when there exists a dynamical model of the data, such as a differential or finite difference equation, that cannot be solved in closed form. When a closed form does exist, curve fitting takes on a slightly different form than polynomial curve fitting, in which there is no model of the flow of data. Clarifying text data requires a means of semantic analysis and, preferably, an ontology as well. Semantic analyzers can be simplified and tailored to the operational requirements of a mission. This requires that agents have a plug-and-play capability for changing their algorithms that is transparent to the agent infrastructure. Keeping semantic analysis algorithms within agents small requires greater reliance on an ontology.

Searching

Searching algorithms are a fundamental part of an operating system. Like clarifying algorithms, the size and complexity of searching algorithms are related to the ontology and the system registrar. If objects are standardized in a registry and the objective of the agent is to find registered objects using standardized input commands, the agent algorithm is very simple and small. If inputs to the agent are informal human language, then the search agent must request the services of a clarifying agent that can access the ontology or perform semantic analysis.
Since clarifying agents can be located anywhere on the system, this places no additional memory or CPU demands on the computer from which the request originates. Search agents can be built from semantic analysis agents, or they can simply request the services of a semantic analysis agent for the purpose of discovery rather than clarification. This is a design decision requiring a trade-off between speed and memory.

Comparing

Comparing two entities consists in checking whether the two entities have the same attributes. If the attributes can be quantified, then the comparison agent returns the distance between the attributes, or the ordering of the attributes.

Selecting

Selection algorithms are based on having a model to which alternatives are compared and valued. The model is an algorithm or a table. Models that are algorithms are more compact than tables. Selection algorithms are numerous and must be designed on criteria that are operationally defined. This may require the services of an ordering agent, mentioned above, or a valuing agent, discussed below. The valuing agent may require the services of a clarifying agent if the value requested is submitted in informal language. Selection agents may also request the services of a comparison agent in conjunction with a valuation agent.

Valuing

Valuing consists in assigning a numerical value to an entity. Value algorithms are operationally dependent. The value of an entity may consist in its distance from a fixed entity or its relationships to other entities. A value agent may request the services of a linking agent in order to discover relationships. The number of relationships between a fixed entity and a set of entities may constitute the value of the entity. A value agent may provide services to an ordering agent.

Examining

An examining agent only determines whether a given entity has a given set of attributes of a reference model. The algorithm inputs are the reference model and the entity to be examined.
The outputs are the set of attributes that are in agreement between the entity and the reference model.

Abstracting

An abstracting agent takes as input a specific instance of an entity and outputs a generic object of which the original entity is a specific instance. Abstracting a parametrically characterized entity has the effect of projecting the entity onto an entity having fewer attributes or dimensions. In vector space terms, this is subspace projection. In taxonomy terms, this has the effect of lifting the entity to a higher level in the taxonomy. Inputs to this type of agent are the entity and the context, which may be supplied through an ontology or a taxonomy.

Analyzing

An analyzing agent decomposes an entity into its subparts. The input is the entity and the classes of possible subparts as defined by an ontology. The ontology may be either conceptually based or pattern based.

Synthesizing

A synthesizing agent takes a set of components as inputs and assembles a set of possible entities that can be made from the set of components as defined in an ontology.

Forming Analogies

An analogy agent takes as inputs an entity and uses an ontology to form analogies with other entities. The output of this algorithm is a set of mappings from the input entity to candidate analogies that associates the subparts of the input entity with the subparts of the candidate analogy entities.
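The analogy agent's input and output can be sketched with a toy ontology. Everything about the representation here is an assumption made for illustration: the ontology is a plain dictionary mapping each entity to its subparts, each subpart carries a role label, and two subparts are associated when they share a role. The name `form_analogies` and the sample entities are hypothetical.

```python
def form_analogies(entity, ontology):
    """Map subparts of `entity` onto subparts of other entities by shared role.

    ontology: {entity_name: {subpart_name: role_label}}  (assumed layout)
    Returns {candidate_entity: {input_subpart: candidate_subpart}}.
    """
    source_parts = ontology[entity]
    analogies = {}
    for candidate, parts in ontology.items():
        if candidate == entity:
            continue
        # Index the candidate's subparts by the role they play.
        role_to_part = {role: part for part, role in parts.items()}
        mapping = {
            part: role_to_part[role]
            for part, role in source_parts.items()
            if role in role_to_part
        }
        if mapping:  # keep only candidates with at least one shared role
            analogies[candidate] = mapping
    return analogies

# Hypothetical ontology with three entities.
ontology = {
    "solar system": {"sun": "center", "planet": "orbiter"},
    "atom": {"nucleus": "center", "electron": "orbiter"},
    "chair": {"leg": "support"},
}
print(form_analogies("solar system", ontology))
```

A real ontology would carry richer relations than a single role label, but the output shape, a mapping from input subparts to candidate subparts, matches the description above.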